To address gait stability control for continuous straight-line walking of a biped robot, a Soft Actor-Critic (SAC) gait control algorithm based on maximum-entropy Deep Reinforcement Learning (DRL) was proposed. Firstly, no accurate dynamic model of the robot had to be built in advance, and all parameters were derived from joint angles without additional sensors. Secondly, cosine similarity was used to classify experience samples and optimize the experience replay mechanism. Finally, reward functions were designed from prior knowledge and experience so that the biped robot continuously adjusted its attitude during straight-line walking training, which ensured the robustness of straight-line walking. The proposed method was compared with other DRL methods, such as Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO), in the Roboschool simulation environment. The results show that the proposed method not only achieves fast and stable straight-line walking of the biped robot but also has better algorithmic robustness.
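The abstract does not specify which vectors are compared or how the resulting classes are used during replay. As a minimal illustrative sketch only, the following assumes that each stored transition's state vector is compared with a reference state, that a fixed threshold separates "similar" from "dissimilar" samples, and that mini-batches draw a fixed fraction from each pool. The names `SimilaritySplitBuffer`, `reference_state`, `threshold`, and `similar_fraction` are hypothetical and are not taken from the proposed method.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


class SimilaritySplitBuffer:
    """Hypothetical replay buffer with two pools split by cosine similarity."""

    def __init__(self, reference_state, threshold=0.9):
        self.reference_state = list(reference_state)  # assumed reference vector
        self.threshold = threshold                    # assumed cut-off value
        self.similar = []
        self.dissimilar = []

    def add(self, transition):
        # A transition is assumed to be (state, action, reward, next_state, done).
        state = transition[0]
        sim = cosine_similarity(state, self.reference_state)
        pool = self.similar if sim >= self.threshold else self.dissimilar
        pool.append(transition)

    def sample(self, batch_size, similar_fraction=0.5, rng=random):
        # Draw from both pools; the batch is smaller if a pool runs short.
        n_similar = min(len(self.similar), int(round(batch_size * similar_fraction)))
        n_dissimilar = min(len(self.dissimilar), batch_size - n_similar)
        return rng.sample(self.similar, n_similar) + rng.sample(self.dissimilar, n_dissimilar)
```

In a sketch like this, the threshold and the mixing fraction would control how strongly replay favors one class of experience; both are placeholders that would have to be tuned or replaced by whatever rule the actual method uses.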